Bernoulli 18(4), 2012, 1405-1420 
DOI: 10.3150/11-BEJ381 

Convergence of the largest eigenvalue of 

Let $X_p = (s_1, \ldots, s_n) = (X_{ij})_{p \times n}$, where the $X_{ij}$ are independent and identically distributed (i.i.d.) random variables with $EX_{11} = 0$, $EX_{11}^2 = 1$ and $EX_{11}^4 < \infty$. It is shown that the largest eigenvalue of the random matrix $A_p = \frac{1}{2\sqrt{np}}(X_p X_p' - nI_p)$ tends to 1 almost surely as $p \to \infty$, $n \to \infty$ with $p/n \to 0$.

1. Introduction

Consider the sample covariance type matrix $S = \frac{1}{n} X_p X_p'$, where $X_p = (s_1, \ldots, s_n) = (X_{ij})_{p \times n}$ and $X_{ij}$, $i = 1, \ldots, p$, $j = 1, \ldots, n$, are i.i.d. random variables with mean zero and variance 1. For such a matrix, much attention has been paid to the asymptotic properties of its eigenvalues in the setting of $p/n \to c > 0$ as $p \to \infty$ and $n \to \infty$. For example, its empirical spectral distribution (ESD) function $F^S(x)$ converges with probability one to the famous Marčenko and Pastur law (see [9] and [8]). Here, the ESD for any matrix $A$ with real eigenvalues $\lambda_1 \le \lambda_2 \le \cdots \le \lambda_p$ is defined by

$$F^A(x) = \frac{1}{p} \#\{i: \lambda_i \le x\},$$

where $\#\{\cdots\}$ denotes the number of elements of the set. Also, with probability one its maximum eigenvalue and minimum eigenvalue converge, respectively, to the right end point and left end point of the support of Marčenko and Pastur's law (see [7] and [3]).
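
As a small illustration of the ESD definition, the sketch below evaluates $F^A(x)$ from a list of eigenvalues. The function name `esd` is hypothetical, and the convention that eigenvalues equal to $x$ are counted (that is, $\lambda_i \le x$) is an assumption.

```python
from bisect import bisect_right

def esd(eigenvalues):
    """Return the ESD x -> (1/p) * #{i : lambda_i <= x} as a callable.

    Assumption: ties are counted, i.e. the comparison is lambda_i <= x.
    """
    lam = sorted(eigenvalues)
    p = len(lam)

    def F(x):
        # bisect_right counts the sorted eigenvalues that are <= x
        return bisect_right(lam, x) / p

    return F

# Four eigenvalues, one of them repeated
F = esd([0.5, 1.0, 1.0, 2.5])
print(F(0.0), F(1.0), F(3.0))
```

The returned function is a right-continuous step function that jumps by (multiplicity)/p at each eigenvalue.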

In contrast with the asymptotic behavior of $S$ in the case of $p/n \to c$, the asymptotic properties of $S$ have not been well understood when $p/n \to 0$. The first breakthrough was made in Bai and Yin [2]. They considered the normalized matrix

$$A_p = \frac{1}{2\sqrt{np}} (X_p X_p' - nI_p)$$

and proved that, with probability one,

$$F^{A_p}(x) \to F(x),$$

which is the so-called semicircle law with density

$$F'(x) = \begin{cases} \frac{2}{\pi}\sqrt{1 - x^2}, & \text{if } |x| \le 1, \\ 0, & \text{if } |x| > 1. \end{cases}$$

One should note that the semicircle law is also the limit of the empirical spectral distribution of a symmetric random matrix whose diagonal entries are i.i.d. random variables and whose above-diagonal entries are also i.i.d. (see [10]). Second, when $X_{11} \sim N(0, 1)$, El Karoui [5] proved that the largest eigenvalue of $X_p X_p'$, after proper centering and scaling, converges to the Tracy-Widom law.
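
The semicircle density on $[-1, 1]$ can be checked numerically. The sketch below is an illustration only: the function names and the midpoint-rule integration are assumptions, not part of the original argument. It confirms that the density $\frac{2}{\pi}\sqrt{1-x^2}$ integrates to 1 and is symmetric about 0.

```python
import math

def semicircle_density(x):
    """Density (2/pi) * sqrt(1 - x^2) on [-1, 1], zero elsewhere."""
    if abs(x) > 1.0:
        return 0.0
    return (2.0 / math.pi) * math.sqrt(1.0 - x * x)

def semicircle_cdf(x, steps=20000):
    """Approximate F(x) by the midpoint rule on [-1, min(x, 1)]."""
    if x <= -1.0:
        return 0.0
    upper = min(x, 1.0)
    h = (upper + 1.0) / steps
    return h * sum(semicircle_density(-1.0 + (m + 0.5) * h) for m in range(steps))

total_mass = semicircle_cdf(1.0)   # should be close to 1
half_mass = semicircle_cdf(0.0)    # should be close to 1/2 by symmetry
print(round(total_mass, 4), round(half_mass, 4))
```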

In this paper, for general $X_{11}$, we investigate the maximum eigenvalue of $A_p$ under the setting of $p/n \to 0$ as $p \to \infty$ and $n \to \infty$. The main results are presented in the following theorems.

Theorem 1. Let $X_p = (X_{ij})_{p \times n}$ where $\{X_{ij}: i = 1, 2, \ldots, p;\ j = 1, 2, \ldots, n\}$ are i.i.d. real random variables with $EX_{11} = 0$, $EX_{11}^2 = 1$ and $EX_{11}^4 < \infty$. Suppose that $n = n(p) \to \infty$ and $p/n \to 0$ as $p \to \infty$. Define

$$A_p = (A_{ij})_{p \times p} = \frac{1}{2\sqrt{np}} (X_p X_p' - nI_p).$$

Then, as $p \to \infty$,

$$\lambda_{\max}(A_p) \to 1 \quad \text{a.s.},$$

where $\lambda_{\max}(A_p)$ represents the largest eigenvalue of $A_p$.
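
To relate this normalization to the sample covariance type matrix $S = \frac{1}{n} X_p X_p'$ of the introduction, the following short computation may help. It is a sketch added for orientation, not part of the original argument.

```latex
A_p = \frac{n}{2\sqrt{np}}\Big(\frac{1}{n} X_p X_p' - I_p\Big)
    = \frac{1}{2}\sqrt{\frac{n}{p}}\,(S - I_p),
\qquad\text{so}\qquad
\lambda_{\max}(S) = 1 + 2\sqrt{\frac{p}{n}}\;\lambda_{\max}(A_p).
```

Under this reading, $\lambda_{\max}(A_p) \to 1$ a.s. says that $\lambda_{\max}(S) = 1 + 2\sqrt{p/n}\,(1 + o(1))$ a.s., which matches the leading terms of the right end point $(1 + \sqrt{c})^2 = 1 + 2\sqrt{c} + c$ of the Marčenko and Pastur support when $c = p/n$ is small.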

Indeed, after truncation and normalization of the entries of the matrix A p , we may 
obtain a better result. 

Theorem 2. Let $n = n(p) \to \infty$ and $p/n \to 0$ as $p \to \infty$. Define a $p \times p$ random matrix

$$A_p = (A_{ij})_{p \times p} = \frac{1}{2\sqrt{np}} (X_p X_p' - nI_p),$$

where $X_p = (X_{ij})_{p \times n}$. Suppose that the $X_{ij}$'s are i.i.d. real random variables and satisfy the following conditions:

(1) $EX_{11} = 0$, $EX_{11}^2 = 1$, $EX_{11}^4 < \infty$ and

(2) $|X_{ij}| \le \delta_p \sqrt[4]{np}$, where $\delta_p \downarrow 0$, but $\delta_p \sqrt[4]{np} \uparrow +\infty$, as $p \to \infty$.

Then, for any $\varepsilon > 0$, $\ell > 0$,

$$P(\lambda_{\max}(A_p) > 1 + \varepsilon) = o(p^{-\ell}).$$

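So far we have considered the sample covariance type matrix $S$. However, a commonly used sample covariance matrix in statistics is

$$S_1 = \frac{1}{n} \sum_{j=1}^{n} (s_j - \bar{s})(s_j - \bar{s})',$$

where

$$\bar{s} = \frac{1}{n} \sum_{j=1}^{n} s_j.$$

Similarly we renormalize it as

$$A_{p1} = \frac{1}{2}\sqrt{\frac{n}{p}}\,(S_1 - I_p).$$

Theorem 3. Under the assumptions of Theorem 1, as $p \to \infty$,

$$\lambda_{\max}(A_{p1}) \to 1 \quad \text{a.s.},$$

where $\lambda_{\max}(A_{p1})$ stands for the largest eigenvalue of $A_{p1}$.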

Estimating a population covariance matrix for high dimensional data is a challenging task. Usually, one cannot expect the sample covariance matrix to be a consistent estimate of a population covariance matrix when both $p$ and $n$ go to infinity, especially when the orders of $p$ and $n$ are very close to each other. In such circumstances, as argued in [4], operator norm consistent estimation of a large population covariance matrix still has nice properties.

Suppose that $\Sigma$ is a population covariance matrix, that is, a nonnegative definite symmetric matrix. Then $\Sigma^{1/2} s_j$, $j = 1, \ldots, n$, may be viewed as an i.i.d. sample drawn from the population with covariance matrix $\Sigma$, where $(\Sigma^{1/2})^2 = \Sigma$. The corresponding sample covariance matrix is

$$S_2 = \frac{1}{n} \sum_{j=1}^{n} (\Sigma^{1/2} s_j - \Sigma^{1/2} \bar{s})(\Sigma^{1/2} s_j - \Sigma^{1/2} \bar{s})'.$$

Theorem 3 indicates that the matrix $S_2$ is an operator norm consistent estimator of $\Sigma$ as long as $p/n \to 0$ when $p \to \infty$. Specifically, we have the following theorem.
Theorem 4. In addition to the assumptions of Theorem 1, assume that $\|\Sigma\|$ is bounded. Then, as $p \to \infty$,

$$\|S_2 - \Sigma\| = O\big(\sqrt{p/n}\big) \quad \text{a.s.},$$

where $\|\cdot\|$ stands for the spectral norm of a matrix.

Remark 1. Related work is [1], where the authors investigated quantitative estimates of the convergence of the empirical covariance matrix in the Log-concave ensemble. Here we obtain a convergence rate of the empirical covariance matrix when the sample vectors are of the form $\Sigma^{1/2} s_j$.

Remark 2. Theorems 1-4 are stated for the real random matrix $X_p$, but they also hold for the complex case under the moment conditions $EX_{11} = 0$, $E|X_{11}|^2 = 1$ and $E|X_{11}|^4 < \infty$. The proofs are similar to those for the real case except for some changes of notation.




2. Proof of Theorem 1 



Throughout the paper, C denotes a constant whose value may vary from line to line. 
Also, all limits in the paper are taken as p — > oo. 
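It follows from the theorem in [2] that

$$\liminf \lambda_{\max}(A_p) \ge 1 \quad \text{a.s.} \tag{1}$$

Thus, it suffices to show that

$$\limsup \lambda_{\max}(A_p) \le 1 \quad \text{a.s.} \tag{2}$$

Let $\hat{A}_p = \frac{1}{2\sqrt{np}}(\hat{X}_p \hat{X}_p' - nI_p)$, where $\hat{X}_p = (\hat{X}_{ij})_{p \times n}$ and $\hat{X}_{ij} = X_{ij} I(|X_{ij}| \le \delta_p \sqrt[4]{np})$, where $\delta_p$ is chosen as the larger of the $\delta_p$ constructed as in (3) and the $\delta_p$ constructed as in (5). On the one hand, since $EX_{11}^4 < \infty$, for any $\delta > 0$ we have

$$\lim_{p \to \infty} \delta^{-4} E|X_{11}|^4 I(|X_{11}| > \delta \sqrt[4]{np}) = 0.$$

Since the above is true for arbitrary positive $\delta$, there exists a sequence of positive $\delta_p$ such that

$$\lim_{p \to \infty} \delta_p = 0, \qquad \lim_{p \to \infty} \delta_p^{-4} E|X_{11}|^4 I(|X_{11}| > \delta_p \sqrt[4]{np}) = 0, \qquad \delta_p \sqrt[4]{np} \uparrow +\infty. \tag{3}$$

On the other hand, since $EX_{11}^4 < \infty$, for any $\nu > 0$,

$$\sum_{k=1}^{\infty} 2^k P(|X_{11}| > \nu 2^{k/4}) < \infty.$$

In view of the arbitrariness of $\nu$, there is a sequence of positive numbers $\nu_k$ such that

$$\nu_k \to 0 \ \text{as } k \to \infty, \qquad \sum_{k=1}^{\infty} 2^k P(|X_{11}| > \nu_k 2^{k/4}) < \infty. \tag{4}$$

For each $k$, let $p_k$ be the maximum $p$ such that $n(p) \cdot p \le 2^k$. For $p_{k-1} < p \le p_k$, set

$$\delta_p = 2\nu_k. \tag{5}$$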
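
The reason for the factor 2 in (5) can be seen from a short comparison of thresholds. The display below is a sketch under the assumption that $n(p) \cdot p$ is nondecreasing in $p$, so that $2^{k-1} < n(p)\,p \le 2^k$ whenever $p_{k-1} < p \le p_k$.

```latex
\delta_p \sqrt[4]{np} \;=\; 2\nu_k (np)^{1/4}
\;>\; 2\nu_k\, 2^{(k-1)/4}
\;=\; 2^{3/4}\,\nu_k\, 2^{k/4}
\;\ge\; \nu_k\, 2^{k/4},
\qquad p_{k-1} < p \le p_k .
```

Hence, within the block $p_{k-1} < p \le p_k$, an entry exceeding the truncation level $\delta_p \sqrt[4]{np}$ must also exceed $\nu_k 2^{k/4}$, and at most $2^k$ entries are involved, which is what allows the summable series in (4) to be used.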

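Let $Z_t = X_{ij}$, $t = (i - 1)n + j$; obviously the $\{Z_t\}$ are i.i.d. We then conclude from (4) and (5) that

$$\begin{aligned}
P(A_p \ne \hat{A}_p, \ \text{i.o.})
&\le \lim_{K \to \infty} P\Big( \bigcup_{k=K}^{\infty} \bigcup_{p_{k-1} < p \le p_k} \bigcup_{i \le p,\, j \le n} \{ |X_{ij}| > \delta_p \sqrt[4]{np} \} \Big) \\
&\le \lim_{K \to \infty} \sum_{k=K}^{\infty} P\Big( \bigcup_{p_{k-1} < p \le p_k} \bigcup_{t=1}^{np} \{ |Z_t| > \nu_k 2^{k/4} \} \Big) \\
&\le \lim_{K \to \infty} \sum_{k=K}^{\infty} P\Big( \bigcup_{t=1}^{2^k} \{ |Z_t| > \nu_k 2^{k/4} \} \Big) \\
&\le \lim_{K \to \infty} \sum_{k=K}^{\infty} 2^k P(|Z_1| > \nu_k 2^{k/4}) = 0.
\end{aligned}$$

It follows that $\lambda_{\max}(A_p) - \lambda_{\max}(\hat{A}_p) \to 0$ a.s. as $p \to \infty$.

From now on, we write $\delta$ for $\delta_p$ to simplify notation. Moreover, set $\tilde{A}_p = \frac{1}{2\sqrt{np}}(\tilde{X}_p \tilde{X}_p' - nI_p)$, where $\tilde{X}_p = (\tilde{X}_{ij})_{p \times n}$ and $\tilde{X}_{ij} = \frac{\hat{X}_{ij} - E\hat{X}_{11}}{\sigma}$. Here, $\sigma^2 = E(\hat{X}_{11} - E\hat{X}_{11})^2$ and $\sigma^2 \to 1$ as $p \to \infty$.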

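We obtain via (3)

$$|E\hat{X}_{11}| \le \frac{E|X_{11}|^4 I(|X_{11}| > \delta \sqrt[4]{np})}{\delta^3 (np)^{3/4}} = o\big((np)^{-3/4}\big) \tag{6}$$

and

$$|\sigma^2 - 1| \le \frac{2E|X_{11}|^4 I(|X_{11}| > \delta \sqrt[4]{np})}{\delta^2 (np)^{1/2}} = o\big((np)^{-1/2}\big). \tag{7}$$
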

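We conclude from the Rayleigh-Ritz theorem that

$$\begin{aligned}
|\lambda_{\max}(\hat{A}_p) - \lambda_{\max}(\tilde{A}_p)|
&\le \frac{1}{2\sqrt{np}} \sup_{\|z\|=1} \Big| \Big( \sum_{i \ne j} z_i z_j \sum_{k=1}^{n} \hat{X}_{ik}\hat{X}_{jk} + \sum_{i=1}^{p} z_i^2 \sum_{k=1}^{n} (\hat{X}_{ik}^2 - 1) \Big) \\
&\qquad\qquad\qquad - \Big( \sum_{i \ne j} z_i z_j \sum_{k=1}^{n} \tilde{X}_{ik}\tilde{X}_{jk} + \sum_{i=1}^{p} z_i^2 \sum_{k=1}^{n} (\tilde{X}_{ik}^2 - 1) \Big) \Big| \\
&\le \frac{1}{2\sqrt{np}} \Big| 1 - \frac{1}{\sigma^2} \Big| \sup_{\|z\|=1} \Big| \sum_{i \ne j} z_i z_j \sum_{k=1}^{n} \hat{X}_{ik}\hat{X}_{jk} + \sum_{i=1}^{p} z_i^2 \sum_{k=1}^{n} (\hat{X}_{ik}^2 - 1) \Big| \\
&\quad + \frac{1}{2\sqrt{np}} \frac{2|E\hat{X}_{11}|}{\sigma^2} \sup_{\|z\|=1} \Big| \sum_{i=1}^{p} \sum_{j=1}^{p} z_i z_j \sum_{k=1}^{n} \hat{X}_{ik} \Big| \\
&\quad + \frac{1}{2\sqrt{np}} \frac{n|E\hat{X}_{11}|^2}{\sigma^2} \sup_{\|z\|=1} \Big| \sum_{i=1}^{p} \sum_{j=1}^{p} z_i z_j \Big|
 + \frac{n|1 - \sigma^2|}{2\sqrt{np}\,\sigma^2} \\
&= A_1 + A_2 + A_3 + A_4.
\end{aligned}$$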
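By (7) and the strong law of large numbers, we have

$$\begin{aligned}
A_1 &= \frac{1}{2\sqrt{np}} \Big| 1 - \frac{1}{\sigma^2} \Big| \sup_{\|z\|=1} \Big| \sum_{i \ne j} z_i z_j \sum_{k=1}^{n} \hat{X}_{ik}\hat{X}_{jk} + \sum_{i=1}^{p} z_i^2 \sum_{k=1}^{n} (\hat{X}_{ik}^2 - 1) \Big| \\
&\le \frac{|\sigma^2 - 1| \sqrt{np}}{2\sigma^2} \cdot \frac{1}{np} \sup_{\|z\|=1} \Big( \sum_{k=1}^{n} \Big( \sum_{i=1}^{p} z_i \hat{X}_{ik} \Big)^2 + n \Big) \\
&\le \frac{|\sigma^2 - 1| \sqrt{np}}{2\sigma^2} \Big( \frac{1}{np} \sum_{k=1}^{n} \sum_{i=1}^{p} \hat{X}_{ik}^2 + \frac{1}{p} \Big) \\
&\to 0 \quad \text{a.s.}
\end{aligned}$$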

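Similarly, (6), Hölder's inequality and the strong law of large numbers yield

$$\begin{aligned}
A_2 &\le \frac{1}{2\sqrt{np}} \frac{2|E\hat{X}_{11}|}{\sigma^2} \sup_{\|z\|=1} \Big| \sum_{i=1}^{p} \sum_{j=1}^{p} z_i z_j \sum_{k=1}^{n} \hat{X}_{ik} \Big| \\
&\le \frac{1}{2\sqrt{np}} \frac{C}{\sigma^2 (np)^{3/4}} \sqrt{p} \, \Big( \sum_{i=1}^{p} \Big( \sum_{k=1}^{n} \hat{X}_{ik} \Big)^2 \Big)^{1/2} \\
&\le \frac{1}{2\sqrt{np}} \frac{C}{\sigma^2 (np)^{3/4}} \sqrt{np} \, \Big( \sum_{i=1}^{p} \sum_{k=1}^{n} \hat{X}_{ik}^2 \Big)^{1/2} \\
&\le \frac{C}{\sigma^2 \sqrt[4]{np}} \Big( \frac{1}{np} \sum_{i=1}^{p} \sum_{k=1}^{n} \hat{X}_{ik}^2 \Big)^{1/2} \\
&\to 0 \quad \text{a.s.}
\end{aligned}$$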
It is straightforward to conclude from (6) and (7) that

$$A_3 \to 0 \quad \text{a.s.}, \qquad A_4 \to 0 \quad \text{a.s.}$$

Thus, we have $\lambda_{\max}(\hat{A}_p) - \lambda_{\max}(\tilde{A}_p) \to 0$ a.s. By the above results, to prove (2), it is sufficient to show that $\limsup_{p \to \infty} \lambda_{\max}(\tilde{A}_p) \le 1$ a.s. To this end, we note that the matrix $\tilde{A}_p$ satisfies all the assumptions in Theorem 2. Therefore, we obtain (2) by Theorem 2 (whose argument is given in the next section). Together with (1), this finishes the proof of Theorem 1.
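
The passage from the tail bound of Theorem 2 to the almost sure statement (2) is a Borel-Cantelli argument. A sketch, writing $\tilde{A}_p$ for the truncated and renormalized matrix and taking $\ell = 2$ as an illustrative choice:

```latex
\sum_{p=1}^{\infty} P\big(\lambda_{\max}(\tilde{A}_p) > 1 + \varepsilon\big)
\;\le\; C \sum_{p=1}^{\infty} p^{-2} \;<\; \infty
\quad\Longrightarrow\quad
P\big(\lambda_{\max}(\tilde{A}_p) > 1 + \varepsilon \ \text{i.o.}\big) = 0 .
```

Hence $\limsup_{p \to \infty} \lambda_{\max}(\tilde{A}_p) \le 1 + \varepsilon$ a.s. for each fixed $\varepsilon > 0$, and letting $\varepsilon \downarrow 0$ along a countable sequence gives the bound 1.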



3. Proof of Theorem 2 



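Suppose that $z = (z_1, \ldots, z_p)'$ is a unit vector. By the Rayleigh-Ritz theorem, we then have

$$\begin{aligned}
\lambda_{\max}(A_p) &= \max_{\|z\|=1} \Big( \sum_{i,j} z_i z_j A_{ij} \Big) \\
&= \max_{\|z\|=1} \Big( \sum_{i \ne j} z_i z_j A_{ij} + \sum_{i=1}^{p} z_i^2 A_{ii} \Big) \\
&\le \lambda_{\max}(B_p) + \max_{i \le p} |A_{ii}|,
\end{aligned}$$

where $B_p = (B_{ij})_{p \times p}$ with

$$B_{ij} = \begin{cases} 0, & \text{if } i = j, \\ \dfrac{1}{2\sqrt{np}} \displaystyle\sum_{k=1}^{n} X_{ik} X_{jk}, & \text{if } i \ne j. \end{cases}$$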
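
The splitting of the quadratic form into an off-diagonal part and a diagonal part can be checked numerically. The sketch below is illustrative only: the helper names, the Gaussian entries and the sizes $p = 4$, $n = 50$ are assumptions. It builds the matrix with entries $\frac{1}{2\sqrt{np}}(\sum_k X_{ik}X_{jk} - n\,1\{i=j\})$ and its off-diagonal part, then verifies the identity for one random unit vector.

```python
import math
import random

def normalized_matrices(X):
    """Return (A, B): A = (X X' - n I) / (2 sqrt(np)), B = A with zero diagonal."""
    p, n = len(X), len(X[0])
    scale = 1.0 / (2.0 * math.sqrt(n * p))
    A = [[scale * (sum(X[i][k] * X[j][k] for k in range(n)) - (n if i == j else 0))
          for j in range(p)] for i in range(p)]
    B = [[0.0 if i == j else A[i][j] for j in range(p)] for i in range(p)]
    return A, B

def quad_form(M, z):
    """Compute z' M z for a square matrix M given as a list of rows."""
    size = len(z)
    return sum(z[i] * M[i][j] * z[j] for i in range(size) for j in range(size))

random.seed(0)
p, n = 4, 50  # illustrative sizes only
X = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
A, B = normalized_matrices(X)

# Random unit vector z
z = [random.gauss(0.0, 1.0) for _ in range(p)]
norm = math.sqrt(sum(v * v for v in z))
z = [v / norm for v in z]

diag_part = sum(z[i] ** 2 * A[i][i] for i in range(p))
gap = quad_form(A, z) - (quad_form(B, z) + diag_part)   # zero up to rounding
max_diag = max(abs(A[i][i]) for i in range(p))
print(abs(gap) < 1e-12, quad_form(A, z) <= quad_form(B, z) + max_diag + 1e-12)
```

Because $\sum_i z_i^2 = 1$, the diagonal part is at most $\max_i |A_{ii}|$, which is the step used in the last inequality above.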



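To prove Theorem 2, it is sufficient to prove, for any $\varepsilon > 0$, $\ell > 0$,

$$P(\lambda_{\max}(B_p) > 1 + \varepsilon) = o(p^{-\ell}) \tag{8}$$

and

$$P\Big( \max_{i \le p} \frac{1}{\sqrt{np}} \Big| \sum_{j=1}^{n} (X_{ij}^2 - 1) \Big| > \varepsilon \Big) = o(p^{-\ell}). \tag{9}$$

We first prove (9). To simplify notation, let $Y_j = X_{1j}^2 - 1$ and $C_1 = E|Y_1|^2$. Then $EY_j = 0$. Choose an appropriate sequence $h = h_p$ such that it satisfies, as $p \to \infty$,

$$h/\log p \to \infty, \qquad \delta^2 h/\log p \to 0. \tag{10}$$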
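We then have

$$\begin{aligned}
P\Big( \max_{i \le p} \frac{1}{\sqrt{np}} \Big| \sum_{j=1}^{n} (X_{ij}^2 - 1) \Big| > \varepsilon \Big)
&\le p \, P\Big( \frac{1}{\sqrt{np}} \Big| \sum_{j=1}^{n} Y_j \Big| > \varepsilon \Big) \\
&\le p \, \varepsilon^{-h} (\sqrt{np})^{-h} E\Big| \sum_{j=1}^{n} Y_j \Big|^h \\
&\le \varepsilon^{-h} p (\sqrt{np})^{-h} \sum_{m=1}^{h/2} \ \sum_{1 \le j_1 < j_2 < \cdots < j_m \le n} \ \sum_{\substack{i_1 + i_2 + \cdots + i_m = h \\ i_1 \ge 2, \ldots, i_m \ge 2}} \frac{h!}{i_1! i_2! \cdots i_m!} E|Y_{j_1}|^{i_1} E|Y_{j_2}|^{i_2} \cdots E|Y_{j_m}|^{i_m} \\
&\le \varepsilon^{-h} p (\sqrt{np})^{-h} \sum_{m=1}^{h/2} \ \sum_{\substack{i_1 + i_2 + \cdots + i_m = h \\ i_1 \ge 2, \ldots, i_m \ge 2}} \frac{n!}{m!(n - m)!} \frac{h!}{i_1! i_2! \cdots i_m!} E|Y_1|^{i_1} E|Y_1|^{i_2} \cdots E|Y_1|^{i_m} \\
&\le \cdots \le \Big( \frac{\xi}{\varepsilon} \Big)^h = o(p^{-\ell}),
\end{aligned}$$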



where $\xi$ is a constant satisfying $0 < \xi < \varepsilon$. Below are some interpretations of the above inequalities:

(a) The fifth inequality holds because $\frac{n!}{m!(n-m)!} < n^m$ and $|Y_1| \le \delta^2 \sqrt{np}$.

(b) We use the fact $\sum_{i_1 + i_2 + \cdots + i_m = h} \frac{h!}{i_1! i_2! \cdots i_m!} \le m^h$ in the sixth inequality.

(c) The seventh inequality uses the elementary inequality

$$a^{-t} t^b \le \Big( \frac{b}{\log a} \Big)^b, \qquad \text{for all } a > 1,\ b > 0,\ t \ge 1 \text{ and } \frac{b}{\log a} > 1.$$

(d) The last two inequalities are due to (10).

(e) With the facts that $\xi/\varepsilon < 1$ and $h/\log p \to \infty$, the last equality is true.



Thus, (9) follows. 
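
For completeness, the elementary inequality $a^{-t} t^b \le (b/\log a)^b$ quoted in the list above can be verified by a one-variable maximization. The following is a sketch of that standard calculus argument, not taken from the original text.

```latex
g(t) := \log\big(a^{-t} t^{b}\big) = b \log t - t \log a, \qquad
g'(t) = \frac{b}{t} - \log a = 0 \iff t^{*} = \frac{b}{\log a},
```
```latex
a^{-t} t^{b} \;\le\; e^{g(t^{*})}
= \Big(\frac{b}{\log a}\Big)^{b} e^{-b}
\;\le\; \Big(\frac{b}{\log a}\Big)^{b}
\qquad (a > 1,\ b > 0,\ t > 0).
```

Since $g$ is concave in $t$, the critical point $t^{*}$ is the global maximizer, so the bound holds uniformly in $t$.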
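Next, consider (8). For any $\varepsilon > 0$, we have

$$\begin{aligned}
P(\lambda_{\max}(B_p) > 1 + \varepsilon)
&\le \frac{E\lambda_{\max}^k(B_p)}{(1 + \varepsilon)^k} \le \frac{E\,\mathrm{tr}(B_p^k)}{(1 + \varepsilon)^k} \\
&\le \frac{1}{(1 + \varepsilon)^k} \frac{1}{(2\sqrt{np})^k} \Big| \sum E X_{i_1 j_1} X_{i_2 j_1} \cdots X_{i_k j_k} X_{i_1 j_k} \Big|,
\end{aligned}$$

where $k = k_p$ satisfies, as $p \to \infty$,

$$k/\log p \to \infty, \qquad \delta^{1/3} k/\log p \to 0,$$

and the summation is taken with respect to $j_1, j_2, \ldots, j_k$ running over all integers in $\{1, 2, \ldots, n\}$ and $i_1, i_2, \ldots, i_k$ running over all integers in $\{1, 2, \ldots, p\}$ subject to the condition that $i_1 \ne i_2$, $i_2 \ne i_3$, $\ldots$, $i_k \ne i_1$.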

In order to get an upper bound for $|\sum E X_{i_1 j_1} X_{i_2 j_1} \cdots X_{i_k j_k} X_{i_1 j_k}|$, we need to construct a graph for given $i_1, \ldots, i_k$ and $j_1, \ldots, j_k$, as in [7, 11] and [3]. We follow the presentation in [3] and [11] to introduce some fundamental concepts associated with the graph.

For the sequence $(i_1, i_2, \ldots, i_k)$ from $\{1, 2, \ldots, p\}$ and the sequence $(j_1, \ldots, j_k)$ from $\{1, 2, \ldots, n\}$, we define a directed graph as follows. Plot two parallel real lines, referred to as the I-line and the J-line, respectively. Draw $\{i_1, i_2, \ldots, i_k\}$ on the I-line, called I-vertices, and draw $\{j_1, j_2, \ldots, j_k\}$ on the J-line, known as J-vertices. The vertices of the graph consist of the I-vertices and J-vertices. The edges of the graph are $\{e_1, e_2, \ldots, e_{2k}\}$, where for $a = 1, \ldots, k$, $e_{2a-1} = i_a j_a$ are called the column edges and $e_{2a} = j_a i_{a+1}$ are called the row edges, with the convention that $i_{k+1} = i_1$. For each column edge $e_{2a-1}$, the vertices $i_a$ and $j_a$ are called the ends of the edge $i_a j_a$ and moreover $i_a$ and $j_a$ are, respectively, the initial and the terminal of the edge $i_a j_a$. Each row edge $e_{2a}$ starts from the vertex $j_a$ and ends with the vertex $i_{a+1}$.

Two vertices are said to coincide if they are both in the I-line or both in the J-line and they are identical. That is, $i_a = i_b$ or $j_a = j_b$. Readers are also reminded that the vertices $i_a$ and $j_b$ are not coincident even if they have the same value because they are in different lines. We say that two edges are coincident if the two edges have the same set of ends.

The graph constructed above is said to be a W-graph if each edge in the graph coincides 
with at least one other edge. See Figure 1 for an example of a W-graph. 
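
The construction of the $2k$ edges and the W-graph condition can be sketched in code. The function names and the tagged-tuple representation of vertices are assumptions made for illustration; sequences are passed as Python lists of positive integers.

```python
from collections import Counter

def build_edges(i_seq, j_seq):
    """Return the 2k edges e_1, ..., e_2k as (initial, terminal) vertex pairs.

    Vertices are tagged with their line ('I' or 'J') so that an I-vertex and
    a J-vertex with the same value never coincide.
    """
    k = len(i_seq)
    assert len(j_seq) == k
    edges = []
    for a in range(k):
        i_a = ("I", i_seq[a])
        j_a = ("J", j_seq[a])
        i_next = ("I", i_seq[(a + 1) % k])  # convention i_{k+1} = i_1
        edges.append((i_a, j_a))     # column edge e_{2a-1}
        edges.append((j_a, i_next))  # row edge e_{2a}
    return edges

def is_w_graph(i_seq, j_seq):
    """True if every edge shares its set of ends with at least one other edge."""
    counts = Counter(frozenset(e) for e in build_edges(i_seq, j_seq))
    return all(c >= 2 for c in counts.values())

print(is_w_graph([1, 2], [1, 1]))        # edges pair up
print(is_w_graph([1, 2, 3], [1, 2, 3]))  # a simple cycle: every edge is single
```

Coincidence is tested on the unordered set of ends, so a column edge $i_a j_a$ and a row edge $j_a i_a$ traversed in the opposite direction count as coincident.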

Two graphs are said to be isomorphic if one becomes the other by an appropriate permutation on $\{1, 2, \ldots, p\}$ of the I-vertices and an appropriate permutation on $\{1, 2, \ldots, n\}$ of the J-vertices. A W-graph is called a canonical graph if $i_a \le \max\{i_1, i_2, \ldots, i_{a-1}\} + 1$ and $j_a \le \max\{j_1, j_2, \ldots, j_{a-1}\} + 1$ with $i_1 = j_1 = 1$, where $a = 1, 2, \ldots, k$.

In the canonical graph, if $i_{a+1} = \max\{i_1, i_2, \ldots, i_a\} + 1$, then the edge $j_a i_{a+1}$ is called a row innovation and if $j_a = \max\{j_1, j_2, \ldots, j_{a-1}\} + 1$, then the edge $i_a j_a$ is called a column innovation. Apparently, a row innovation and a column innovation, respectively, lead to a new I-vertex and a new J-vertex, except the first column innovation $i_1 j_1$, which leads to a new I-vertex $i_1$ and a new J-vertex $j_1$.
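
The canonical-graph condition and the two kinds of innovations translate directly into index checks. The sketch below is illustrative: the function names are hypothetical, and the maximum over an empty set is taken to be 0, which is an assumption needed to treat the first column edge.

```python
def is_canonical(i_seq, j_seq):
    """Check i_1 = j_1 = 1 and that each new value exceeds the running max by at most 1."""
    for seq in (i_seq, j_seq):
        if seq[0] != 1:
            return False
        running_max = 0  # assumption: max over an empty set is 0
        for v in seq:
            if v > running_max + 1:
                return False
            running_max = max(running_max, v)
    return True

def innovations(i_seq, j_seq):
    """Return (column_innovations, row_innovations) as lists of 1-based indices a."""
    k = len(i_seq)
    col, row = [], []
    for a in range(1, k + 1):
        # column innovation i_a j_a: j_a = max{j_1, ..., j_{a-1}} + 1
        if j_seq[a - 1] == max(j_seq[: a - 1], default=0) + 1:
            col.append(a)
        # row innovation j_a i_{a+1}: i_{a+1} = max{i_1, ..., i_a} + 1,
        # with the convention i_{k+1} = i_1
        i_next = i_seq[a % k]
        if i_next == max(i_seq[:a]) + 1:
            row.append(a)
    return col, row

print(is_canonical([1, 2, 1, 2], [1, 1, 1, 1]))
print(innovations([1, 2, 1, 2], [1, 1, 1, 1]))
print(innovations([1, 2, 3], [1, 1, 1]))
```

In the first example only $i_1 j_1$ and $j_1 i_2$ are innovations; in the second, the row edges $j_1 i_2$ and $j_2 i_3$ both introduce new I-vertices.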
We now classify all edges into three types, $T_1$, $T_3$ and $T_4$. Let $T_1$ denote the set of all innovations, including row innovations and column innovations. We further distinguish the column innovations as follows. An edge $i_a j_a$ is called a $T_{11}$ edge if it is a column innovation and the edge $j_a i_{a+1}$ is a row innovation; an edge $i_b j_b$ is referred to as a $T_{12}$ edge if it is a column innovation but $j_b i_{b+1}$ is not a row innovation. An edge $e_j$ is said to be a $T_3$ edge if there is an innovation edge $e_i$, $i < j$, so that $e_j$ is the first one to coincide with $e_i$. An edge is called a $T_4$ edge if it is neither a $T_1$ edge nor a $T_3$ edge. The first appearance of a $T_4$ edge is referred to as a $T_2$ edge. There are two kinds of $T_2$ edges:

(a) the first appearance of an edge that coincides with a $T_3$ edge, denoted by $T_{21}$ edge;

(b) the first appearance of an edge that is not an innovation, denoted by $T_{22}$ edge.

We say that an edge $e_i$ is single up to the edge $e_j$, $j \ge i$, if it does not coincide with any other edges among $e_1, \ldots, e_j$ except itself. A $T_3$ edge $e_j$ is said to be regular if there is more than one innovation with a vertex equal to the initial vertex of $e_j$ and single up to $e_{j-1}$, among the edges $\{e_1, \ldots, e_{j-1}\}$. All other $T_3$ edges are called irregular $T_3$ edges.

Corresponding to the above classification of the edges, we introduce the following 
notation and list some useful facts. 

1. Denote by $l$ the total number of innovations.

2. Let $r$ be the number of the row innovations. Moreover, let $c$ denote the number of the column innovations. We then have $r + c = l$.

3. Define $r_1$ to be the number of the $T_{11}$ edges. Then $r_1 \le r$ by the definition of a $T_{11}$ edge. Also, the number of the $T_{12}$ edges is $l - r - r_1$.

4. Let $t$ be the number of the $T_2$ edges. Note that the number of the $T_3$ edges is the same as the number of the innovations and there are a total of $2k$ edges in the graph. It follows that the number of the $T_4$ edges is $2k - 2l$. On the other hand, each $T_2$ edge is also a $T_4$ edge. Therefore, $t \le 2k - 2l$.

5. Define $\mu$ to be the number of $T_{21}$ edges. Obviously, $\mu \le t$. The number of $T_{22}$ edges is then $t - \mu$. Since each $T_{21}$ edge coincides with one innovation, we let $n_i$, $i = 1, 2, \ldots, \mu$, denote the number of $T_4$ edges which coincide with the $i$th such innovation, $n_i > 0$.

6. Let $\mu_1$ be the number of $T_{21}$ edges which do not coincide with the other $T_4$ edges. That is, $\mu_1 = \#\{i: n_i = 1, i = 1, 2, \ldots, \mu\}$, where $\#\{\cdot\}$ denotes the cardinality of the set $\{\cdot\}$.

7. Let $m_j$, $j = 1, 2, \ldots, t - \mu$, denote the number of $T_4$ edges which coincide with and include the $j$th $T_{22}$ edge. Note that $m_j \ge 2$.
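
The counting facts in the list above can be checked on a small canonical W-graph. The sketch below is an illustration under stated assumptions: the function names are hypothetical, an innovation is detected as an edge whose terminal vertex has not appeared before (which agrees with the definition for canonical graphs), and the finer split of $T_2$ edges into $T_{21}$ and $T_{22}$ is not modelled.

```python
def build_edges(i_seq, j_seq):
    """Edges e_1, ..., e_2k as (initial, terminal) pairs of line-tagged vertices."""
    k = len(i_seq)
    edges = []
    for a in range(k):
        i_a, j_a = ("I", i_seq[a]), ("J", j_seq[a])
        i_next = ("I", i_seq[(a + 1) % k])  # convention i_{k+1} = i_1
        edges.append((i_a, j_a))     # column edge
        edges.append((j_a, i_next))  # row edge
    return edges

def classify(i_seq, j_seq):
    """Label each edge 'T1' (innovation), 'T3' or 'T4' for a canonical W-graph."""
    edges = build_edges(i_seq, j_seq)
    seen = {("I", i_seq[0])}   # the walk starts at i_1
    unmatched = set()          # end-sets of innovations not yet met by a T3 edge
    labels = []
    for e in edges:
        ends = frozenset(e)
        if e[1] not in seen:       # terminal vertex is new: innovation
            labels.append("T1")
            seen.add(e[1])
            unmatched.add(ends)
        elif ends in unmatched:    # first edge to coincide with an innovation
            labels.append("T3")
            unmatched.remove(ends)
        else:
            labels.append("T4")
    return edges, labels

i_seq, j_seq = [1, 2, 1, 2], [1, 1, 1, 1]   # k = 4
edges, labels = classify(i_seq, j_seq)
k = len(i_seq)
l = labels.count("T1")
# t: number of distinct end-sets among T4 edges, i.e. first appearances of T4 edges
t = len({frozenset(e) for e, lab in zip(edges, labels) if lab == "T4"})
print(labels)
print(l, labels.count("T3"), labels.count("T4"), t)
```

For this example the number of $T_3$ edges equals $l$, the number of $T_4$ edges equals $2k - 2l$, and $t \le 2k - 2l$, in line with fact 4.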
We now claim that

$$\begin{aligned}
E\,\mathrm{tr}(B_p^k) &\le (2\sqrt{np})^{-k} \sum E(X_{i_1 j_1} X_{i_2 j_1} \cdots X_{i_k j_k} X_{i_1 j_k}) \\
&= (2\sqrt{np})^{-k} \sum{}' \sum{}'' \sum{}''' \sum{}^{*} E(X_{i_1 j_1} \cdots X_{i_k j_k} X_{i_1 j_k}) \\
&\le (2\sqrt{np})^{-k} \sum_{l=1}^{k} \sum_{r=1}^{l} \sum_{r_1=0}^{r} \sum_{t=0}^{2k-2l} \sum_{\mu=0}^{t} \sum_{\mu_1=0}^{\mu} \binom{k}{r} \binom{r}{r_1} \binom{k - r_1}{l - r - r_1} \binom{2k - l}{l} \\
&\qquad \times k^{3t} (t + 1)^{6k-6l} (\delta \sqrt[4]{np})^{2k-2l-2t+\mu_1} p^{r+1} n^{l-r},
\end{aligned} \tag{11}$$

where the summation $\sum'$ is with respect to different arrangements of the three types of edges at the $2k$ different positions, the summation $\sum''$ is over different canonical graphs with a given arrangement of the three types of edges for the $2k$ positions, the third summation $\sum'''$ is with respect to all isomorphic graphs for a given canonical graph, and the last notation $\sum^{*}$ denotes the constraint that $i_1 \ne i_2$, $i_2 \ne i_3$, $\ldots$, $i_k \ne i_1$.
Now, we explain why the above estimate is true: 

(i) The factor $(2\sqrt{np})^{-k}$ is obvious.

(ii) If the graph is not a W-graph, which means there is a single edge in the graph, then the mean of the product corresponding to this graph is zero (since $EX_{11} = 0$). Thus, we have $l \le k$. Moreover, the facts that $r \le l$, $r_1 \le r$, $t \le 2k - 2l$, $\mu \le t$ and $\mu_1 \le \mu$ are easily obtained from fact 1 to fact 7 listed before.

(iii) There are at most $\binom{k}{r}$ ways to choose $r$ edges out of the $k$ row edges to be the $r$ row innovations. Subsequently, we consider how to select the column innovations. By the definition of $T_{11}$ edges, there are $\binom{r}{r_1}$ ways to select $r_1$ row innovations out of the total $r$ row innovations so that the edge before each of these $r_1$ row innovations is a $T_{11}$ edge, a column innovation. Moreover, there are at most $\binom{k - r_1}{l - r - r_1}$ ways to choose $l - r - r_1$ edges out of the remaining $k - r_1$ column edges to be the $l - r - r_1$ $T_{12}$ edges, the remaining column innovations.

(iv) Given the positions of the $l$ innovations, there are at most $\binom{2k - l}{l}$ ways to select $l$ edges out of the $2k - l$ edges to be $T_3$ edges. The remaining positions are for the $T_4$ edges. Therefore, the first summation $\sum'$ is bounded by

$$\sum_{l=1}^{k} \sum_{r=1}^{l} \sum_{r_1=0}^{r} \binom{k}{r} \binom{r}{r_1} \binom{k - r_1}{l - r - r_1} \binom{2k - l}{l}.$$

(v) By definition, each innovation (or each irregular $T_3$ edge) is uniquely determined by the subgraph prior to the innovation (or the irregular $T_3$ edge). Moreover, by Lemma 3.2 in [11], for each regular $T_3$ edge there are at most $t + 1$ innovations so that the regular $T_3$ edge coincides with one of them, and by Lemma 3.3 in [11] there are at most $2t$ regular $T_3$ edges. Therefore, there are at most $(t + 1)^{2t} \le (t + 1)^{2(2k-2l)}$ ways to draw the regular $T_3$ edges.

(vi) Once the positions of the innovations and the $T_3$ edges are fixed, there are at most $\binom{(r+1)c}{t} \le \binom{k^2}{t} \le k^{2t}$ ways to arrange the $t$ $T_2$ edges, as there are $r + 1$ I-vertices and $c$ J-vertices. After the $t$ positions of the $T_2$ edges are determined, there are at most $t^{2k-2l}$ ways to distribute the $2k - 2l$ $T_4$ edges among the $t$ positions. So there are at most $k^{2t} \cdot t^{2k-2l}$ ways to arrange the $T_4$ edges. It follows that $\sum''$ is bounded by $\sum_{t=0}^{2k-2l} (t + 1)^{2(2k-2l)} k^{2t} \cdot t^{2k-2l}$.

(vii) The third summation $\sum'''$ is bounded by $n^c p^{r+1}$ because the number of graphs in the isomorphic class for a given canonical graph is $p(p - 1) \cdots (p - r) \, n(n - 1) \cdots (n - c + 1)$.

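(viii) Recalling the definitions of $l, r, t, \mu, \mu_1, n_i, m_j$, we have

$$E X_{i_1 j_1} X_{i_2 j_1} \cdots X_{i_k j_k} X_{i_1 j_k} = (EX_{11}^2)^{l-\mu} \prod_{i=1}^{\mu} EX_{11}^{n_i + 2} \prod_{j=1}^{t-\mu} EX_{11}^{m_j}, \tag{12}$$

where $\sum_{i=1}^{\mu} n_i + \sum_{j=1}^{t-\mu} m_j = 2k - 2l$. Without loss of generality, we suppose $n_1 = n_2 = \cdots = n_{\mu_1} = 1$ and $n_{\mu_1 + 1}, \ldots, n_{\mu} \ge 2$ for convenience. It is easy to check that

$$E|X_{11}|^s \le \begin{cases} M (\delta \sqrt[4]{np})^{s-4}, & \text{if } s \ge 4, \quad M = \max\{EX_{11}^4, |EX_{11}^3|\}, \\ (\delta \sqrt[4]{np})^{s-2}, & \text{if } s \ge 2. \end{cases}$$

Thus, (12) becomes

$$\begin{aligned}
|E X_{i_1 j_1} X_{i_2 j_1} \cdots X_{i_k j_k} X_{i_1 j_k}|
&\le \prod_{i=1}^{\mu_1} |EX_{11}^3| \prod_{i=\mu_1+1}^{\mu} E|X_{11}|^{n_i+2} \prod_{j=1}^{t-\mu} E|X_{11}|^{m_j} \\
&\le M^{t} (\delta \sqrt[4]{np})^{2k-2l-2t+\mu_1} \\
&\le k^{t} (\delta \sqrt[4]{np})^{2k-2l-2t+\mu_1}, \quad \text{when } k \text{ is large enough.}
\end{aligned} \tag{13}$$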

The above points regarding the $T_2$ edges are discussed for $t > 0$, but they are still valid when $t = 0$ with the convention that $0^0 = 1$ in the term $t^{2k-2l}$, because in this case there are only $T_1$ edges and $T_3$ edges in the graph and thus $l = k$.

Consider the constraint now. Note that each $T_{12}$ edge, say $i_a j_a$, is a column innovation, but the next row edge $j_a i_{a+1}$ is not a row innovation. Since $i_{a+1} \ne i_a$, the edge $j_a i_{a+1}$ cannot coincide with the edge $i_a j_a$. Moreover, it also does not coincide with any edge before the edge $i_a j_a$ since $j_a$ is a new vertex. So $j_a i_{a+1}$ must be a $T_{22}$ edge. Thus, the number of the $T_{12}$ edges cannot exceed the number of the $T_{22}$ edges. This implies $l - r - r_1 \le t - \mu$. Moreover, note that $\mu_1 \le \mu$. We then have

$$\begin{aligned}
n^{-k/2} p^{-k/2} n^{l-r} p^{r+1} (np)^{k/2 - l/2 - t/2 + \mu_1/4}
&= (n/p)^{l/2} \cdot n^{-r - t/2 + \mu_1/4} p^{r + 1 - t/2 + \mu_1/4} \\
&\le \Big( \sqrt{\frac{p}{n}} \Big)^{r - r_1} \cdot p^{-t/2} \, p.
\end{aligned} \tag{14}$$
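We thus conclude from (11) and (14) that

$$\begin{aligned}
E\,\mathrm{tr}(B_p^k) \le 2^{-k} \sum_{l=1}^{k} \sum_{r=1}^{l} \sum_{r_1=0}^{r} \sum_{t=0}^{2k-2l} \sum_{\mu=0}^{t} \sum_{\mu_1=0}^{\mu}
& \binom{k}{r} \binom{r}{r_1} \binom{k - r_1}{l - r - r_1} \binom{2k - l}{l} \\
& \times \Big( \sqrt{\frac{p}{n}} \Big)^{r - r_1} p^{-t/2} \, p \, k^{3t} (t + 1)^{6k-6l} \delta^{2k-2l-2t+\mu_1}.
\end{aligned} \tag{15}$$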



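Moreover, we claim that

$$\begin{aligned}
& 2^{-k} \binom{k}{r} \cdot \binom{r}{r_1} \Big( \sqrt{\frac{p}{n}} \Big)^{r - r_1} \cdot \binom{k - r_1}{l - r - r_1} \delta^{l - r - r_1} \cdot \binom{2k - l}{l} \Big( \frac{k^3}{\sqrt{p}\, \delta^3} \Big)^{t} (t + 1)^{6k-6l} \delta^{2k-2l} \\
& \qquad \times \delta^{-(l - r - r_1) + 3t - (2k - 2l)} \cdot \delta^{2k-2l-2t+\mu_1} \\
& \quad \le \sqrt{p} \, \Big( 1 + \sqrt{\frac{p}{n}} \Big)^{k} (1 + \delta)^{k} \binom{2k}{2l} \Big( \frac{24^3 k^3 \delta}{\log^3 p} \Big)^{2k-2l}.
\end{aligned} \tag{16}$$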



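Indeed, the above claim is based on the following five facts.

(1) $2^{-k} \binom{k}{r} \le 2^{-k} \sum_{s=0}^{k} \binom{k}{s} = 1$.

(2) $\binom{r}{r_1} \big( \sqrt{p/n} \big)^{r - r_1} \le \sum_{s=0}^{r} \binom{r}{s} \big( \sqrt{p/n} \big)^{s} = \big( 1 + \sqrt{p/n} \big)^{r} \le \big( 1 + \sqrt{p/n} \big)^{k}$.

(3) $\binom{k - r_1}{l - r - r_1} \delta^{l - r - r_1} \le \sum_{s=0}^{k - r_1} \binom{k - r_1}{s} \delta^{s} = (1 + \delta)^{k - r_1} \le (1 + \delta)^{k}$.

(4) By the fact that $\binom{2k - l}{l} \le \binom{2k}{2l}$, and the inequality $a^{-t} (t + 1)^b \le a \big( \frac{b}{\log a} \big)^b$, for $a > 1$, $b > 0$, $t \ge 0$ and $\frac{b}{\log a} > 1$, we have

$$\begin{aligned}
\binom{2k - l}{l} \Big( \frac{k^3}{\sqrt{p}\, \delta^3} \Big)^{t} (t + 1)^{6k-6l} \delta^{2k-2l}
&\le \binom{2k}{2l} \frac{\sqrt{p}\, \delta^3}{k^3} \Big( \frac{6k - 6l}{\log(\sqrt{p}\, \delta^3 / k^3)} \Big)^{6k-6l} \delta^{2k-2l} \\
&\le \binom{2k}{2l} \sqrt{p} \, \Big( \frac{24k}{\log p} \Big)^{6k-6l} \delta^{2k-2l} \\
&= p^{1/2} \binom{2k}{2l} \Big( \frac{24^3 k^3 \delta}{\log^3 p} \Big)^{2k-2l}.
\end{aligned}$$

(5) When $p$ is large enough, $\delta^{-(l - r - r_1) + 3t - (2k - 2l)} \cdot \delta^{2k-2l-2t+\mu_1} = \delta^{t - (l - r - r_1)} \cdot \delta^{\mu_1} \le 1$, since $\delta \to 0$ and $l - r - r_1 \le t - \mu$.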

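Summarizing (15) and (16), we obtain that

$$\begin{aligned}
E\,\mathrm{tr}(B_p^k)
&\le \sum_{l=1}^{k} \sum_{r=1}^{l} \sum_{r_1=0}^{r} \sum_{t=0}^{2k-2l} \sum_{\mu=0}^{t} \sum_{\mu_1=0}^{\mu} p^2 \Big( 1 + \sqrt{\frac{p}{n}} \Big)^{k} (1 + \delta)^{k} \binom{2k}{2l} \Big( \frac{24^3 k^3 \delta}{\log^3 p} \Big)^{2k-2l} \\
&\le 8k^6 p^2 \Big( 1 + \sqrt{\frac{p}{n}} \Big)^{k} (1 + \delta)^{k} \Big( 1 + \frac{24^3 k^3 \delta}{\log^3 p} \Big)^{2k} \\
&\le \eta^k,
\end{aligned}$$

where $\eta$ is a constant satisfying $1 < \eta < 1 + \varepsilon$. Here the last inequality uses the facts below:

(i) $(p^2)^{1/k} \to 1$, because $k/\log p \to \infty$,

(ii) $(8k^6)^{1/k} \to 1$, because $k \to \infty$,

(iii) $(1 + \sqrt{p/n}) \to 1$, because $p/n \to 0$,

(iv) $(1 + \delta) \to 1$, because $\delta \to 0$,

(v) $\frac{24^3 k^3 \delta}{\log^3 p} \to 0$, because $\frac{\delta^{1/3} k}{\log p} \to 0$.

It follows that

$$P(\lambda_{\max}(B_p) > 1 + \varepsilon) \le \Big( \frac{\eta}{1 + \varepsilon} \Big)^{k} = o(p^{-\ell}),$$

since $k/\log p \to \infty$ and $\frac{\eta}{1 + \varepsilon} < 1$. The proof is complete.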

4. Proof of Theorem 3 

Note that

$$S_1 = S - \bar{s}\bar{s}'. \tag{17}$$
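
Identity (17), in which the subtracted term is the outer product of the sample mean vector with itself, can be confirmed numerically on a small data matrix. The sketch below is an illustration only; the helper name, the Gaussian entries and the sizes are assumptions.

```python
import random

def covariance_matrices(X):
    """Return (S, S1, sbar) for a p x n data matrix X given as a list of rows.

    S  = (1/n) * sum_j s_j s_j'                  (uncentered)
    S1 = (1/n) * sum_j (s_j - sbar)(s_j - sbar)' (centered at the sample mean)
    """
    p, n = len(X), len(X[0])
    sbar = [sum(row) / n for row in X]
    S = [[sum(X[i][k] * X[j][k] for k in range(n)) / n for j in range(p)]
         for i in range(p)]
    S1 = [[sum((X[i][k] - sbar[i]) * (X[j][k] - sbar[j]) for k in range(n)) / n
           for j in range(p)] for i in range(p)]
    return S, S1, sbar

random.seed(1)
p, n = 3, 40  # illustrative sizes only
X = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(p)]
S, S1, sbar = covariance_matrices(X)

# Largest entrywise deviation between S1 and S - sbar sbar'
max_gap = max(abs(S1[i][j] - (S[i][j] - sbar[i] * sbar[j]))
              for i in range(p) for j in range(p))
print(max_gap < 1e-12)
```

Since the difference between the two matrices is a rank-one nonnegative definite matrix, centering can only decrease the largest eigenvalue, which is the direction exploited later in this section.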

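By the Fan inequality [6],

$$\sup_x |F^{A_{p1}}(x) - F^{A_p}(x)| \le \frac{1}{p}.$$

Thus from the theorem in [2], we see that

$$F^{A_{p1}}(x) \to F(x),$$

the semicircle law specified in the introduction. It follows that

$$\liminf \lambda_{\max}(A_{p1}) \ge 1.$$

Let $z$ be a unit vector. In view of (17), we obtain

$$z' A_{p1} z = z' A_p z - \frac{1}{2}\sqrt{\frac{n}{p}} \, z' \bar{s}\bar{s}' z \le z' A_p z,$$

so that $\lambda_{\max}(A_{p1}) \le \lambda_{\max}(A_p)$, and Theorem 1 then gives $\limsup \lambda_{\max}(A_{p1}) \le 1$ a.s.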




5. Proof of Theorem 4 



Theorem 4 follows from Theorem 3 and the fact that

$$\|S_2 - \Sigma\| = \|\Sigma^{1/2} (S_1 - I_p) \Sigma^{1/2}\| \le \|S_1 - I_p\| \, \|\Sigma\|.$$
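
The inequality in the last display is the submultiplicativity of the spectral norm combined with the identity $\|\Sigma^{1/2}\|^2 = \|\Sigma\|$, which holds for a symmetric nonnegative definite $\Sigma$. Spelled out as a sketch for a generic symmetric matrix $M$ (a symbol introduced here for illustration):

```latex
\|\Sigma^{1/2} M \Sigma^{1/2}\|
\;\le\; \|\Sigma^{1/2}\| \, \|M\| \, \|\Sigma^{1/2}\|
\;=\; \|\Sigma\| \, \|M\|,
\qquad\text{since}\qquad
\|\Sigma^{1/2}\|^{2} = \lambda_{\max}(\Sigma^{1/2})^{2} = \lambda_{\max}(\Sigma) = \|\Sigma\| .
```

Taking $M = S_1 - I_p$ recovers the bound used above.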